Automatic Noun Sense Disambiguation

Authors

  • Paolo Rosso
  • Francesco Masulli
  • Davide Buscaldi
  • Ferran Plà
  • Antonio Molina
Abstract

This paper explores a fully automatic knowledge-based method that performs noun sense disambiguation relying only on the WordNet ontology. The method is based on the idea of conceptual density, that is, the correlation between the sense of a given word and its context. A new formula for calculating the conceptual density is proposed and evaluated on the SemCor corpus.

1 An Extension of the Conceptual Density

The task of Word Sense Disambiguation (WSD) consists of examining word tokens and specifying exactly which sense of each word is being used. The WordNet (WN) ontology, based on synsets (sets of synonyms), is an external lexical resource often used to perform the WSD task. In most WSD approaches, a word is disambiguated along with a portion of the text in which it is embedded, that is, its context. When the initial input (i.e., the word and its context) is processed only together with the lexical knowledge source (e.g., WN), WSD must be performed by a fully automatic method that requires no training process.

Conceptual Density (CD) is a measure of the correlation between the sense of a given word and its context. The foundation of this measure is the Conceptual Distance, defined as the length of the shortest path connecting two concepts in a hierarchical semantic net. The starting point for our work was the CD formula of Agirre and Rigau [1], which compares areas of subhierarchies:

$$CD(c,m) = \frac{\sum_{i=0}^{m-1} \mathit{nhyp}^{\,i}}{\sum_{i=0}^{h-1} \mathit{nhyp}^{\,i}} \qquad (1)$$

where c is the synset at the top of the subhierarchy, m the number of word senses falling within the subhierarchy, h the height of the subhierarchy, and nhyp the ...

This work was supported by the Spanish Research Projects CICYT TIC2000-0664C02 and TIC2000-1599-C01-01, and the VIDI of the Univ. Politécnica Valencia.

A. Gelbukh (Ed.): CICLing 2003, LNCS 2588, pp. 273–276, 2003. © Springer-Verlag Berlin Heidelberg 2003
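As an illustration only, formula (1) can be evaluated numerically with a short sketch like the one below. The text is cut off before it defines nhyp, so the code simply treats it as a non-negative number supplied by the caller. The function name `conceptual_density` and the argument checks are assumptions made for this example, not part of the paper.

```python
def conceptual_density(m: int, h: int, nhyp: float) -> float:
    """Evaluate formula (1): ratio of two truncated geometric sums.

    m    -- number of word senses falling within the subhierarchy
    h    -- height of the subhierarchy (must be at least 1)
    nhyp -- per-node quantity from formula (1); its definition is cut off
            in the source text, so it is taken here as a plain number
    """
    if h < 1:
        raise ValueError("h must be at least 1 so the denominator is non-zero")
    if m < 0:
        raise ValueError("m must be non-negative")
    if nhyp < 0:
        raise ValueError("nhyp is assumed to be non-negative")

    # Numerator: sum of nhyp^i for i = 0 .. m-1
    numerator = sum(nhyp ** i for i in range(m))
    # Denominator: sum of nhyp^i for i = 0 .. h-1
    denominator = sum(nhyp ** i for i in range(h))
    return numerator / denominator


if __name__ == "__main__":
    # Toy values chosen for illustration; they do not come from the paper.
    print(conceptual_density(m=2, h=3, nhyp=2.0))
```

With these toy values the numerator is 1 + 2 and the denominator is 1 + 2 + 4, so the ratio is 3/7. When m equals h the two sums coincide and the value is 1.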


Similar resources

Google & WordNet based Word Sense Disambiguation

This paper presents an unsupervised methodology for automatic disambiguation of noun terms found in domain specific unrestricted corpora. This method extends approaches of Fragos (Fragos et al., 2003) and others that use the WordNet (Miller, 1998) database in order to resolve semantic ambiguity. The method is evaluated by disambiguating the noun collection of SemCor 2.0. Parameter adjustment wa...


WSD and Closed Semantic Constraint

The application-driven construction of lexicon has been emphasized as a methodology of Computational Lexicology recently. We focus on the closed semantic constraint of the argument(s) of any verb concept by the noun concepts in a WordNet-like lexicon, which theoretically is related to Word Sense Disambiguation (WSD) at different levels. From the viewpoint of Dynamic Lexicon, WSD provides a way ...


Word Sense Disambiguation Using Vectors of Co-occurrence Information

This paper reports on the word sense disambiguation of Korean nouns by using co-occurrence information in context. For a given noun, its local contextual word distribution is not enough to express its semantic characteristics for noun sense disambiguation. This paper proposes a cluster-based sense as a base vector. Contextual noise is removed by a term weighting method, and hypernyms of remain...


Automatic WSD: Does it Make Sense of Estonian?

This paper describes a fully automatic Estonian word sense disambiguation system called semyhe, which is based on Estonian WordNet (EstWN) hyponym/hypernym hierarchies and is meant to disambiguate both nouns and verbs. 1 Short description of the system The main inspiration for our system is the similar system of Agirre and Rigau (1996), which disambiguates English noun senses based on WordNet hyponym/hy...


A Proposal for Word Sense Disambiguation using Conceptual Distance

This paper presents a method for the resolution of lexical ambiguity and its automatic evaluation over the Brown Corpus. The method relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a Conceptual Density formula developed for this purpose. This fully automatic method requires no hand coding of lexical entries, hand t...


Word Sense Disambiguation using Conceptual Density

This paper presents a method for the resolution of lexical ambiguity of nouns and its automatic evaluation over the Brown Corpus. The method relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a Conceptual Density formula developed for this purpose. This fully automatic method requires no hand coding of lexical entries...



Publication date: 2003